v1.0 MVP Solo Founder 6 Weeks

Technical Architecture

Production-ready SaaS stack for ModelPilot AI, a multi-tenant no-code AI chatbot platform. Optimized for solo founder velocity with clean abstractions that scale.

Tech Stack

⚡ Next.js 14: Frontend · App Router · TypeScript
🐍 FastAPI: Backend · Python · Async
🗄️ Supabase: PostgreSQL · Auth · Storage · Realtime
🔮 Qdrant: Vector DB · RAG · Semantic Search
🔀 LiteLLM: AI Gateway · Multi-provider routing
🔴 Redis + Celery: Task queue · Document processing
🚂 Railway: Backend hosting · Auto-deploy
▲ Vercel: Frontend hosting · Edge network

What NOT to Build in MVP

❌ Skip in MVP
Custom model fine-tuning
Mobile app (iOS/Android)
Real-time collaboration
Advanced analytics dashboards
Zapier / Slack integrations
Custom LLM hosting
White-label platform
✓ Ship in MVP
Multi-provider AI routing
PDF/URL knowledge ingestion
Embeddable chat widget
Multi-tenant auth
Usage + cost tracking
Basic conversation logs
Stripe billing

6-Week Sprint Plan

Sequenced for a solo founder. Each week ships something live. Ship early, iterate fast.

W1
Foundation
Project scaffolding + Auth + Multi-tenancy
Init Next.js 14 (App Router) + FastAPI monorepo
Supabase project: Auth (email + Google OAuth), RLS policies
PostgreSQL schema: workspaces, users, workspace_members
JWT middleware on FastAPI, tenant context injection
Deploy skeleton to Vercel + Railway
Basic dashboard shell UI (sidebar, routing)
W2
Core AI
LiteLLM gateway + Chat endpoint + Chatbot builder
LiteLLM proxy setup with OpenAI, Anthropic, Gemini
POST /chat endpoint with SSE streaming
Chatbot CRUD API + DB tables
AI provider key management (encrypted storage)
Chatbot builder UI (name, model, system prompt)
Basic conversation logging to PostgreSQL
W3
Knowledge
RAG pipeline + Document ingestion + Qdrant
Qdrant Cloud setup, collection per workspace
Celery + Redis for async document processing
PDF/DOCX text extraction (pdfplumber, python-docx)
URL scraping (BeautifulSoup + trafilatura)
Chunking + embedding pipeline (text-embedding-3-small)
RAG retrieval injected into chat context
Knowledge base UI (upload, status, list)
W4
Widget
Embeddable chat widget + Public chat API
Public /widget/:botId endpoint (no auth, CORS open)
Vanilla JS widget (zero dependencies, <8KB gzipped)
CDN hosting (Cloudflare R2 or Supabase Storage)
Widget customization: colors, greeting, position
Embed code generator UI
Rate limiting per botId (Redis sliding window)
W5
Billing
Stripe billing + Usage tracking + Limits
Stripe Checkout + webhook handler (subscription CRUD)
Usage metering: messages, tokens, cost per workspace
Plan enforcement middleware (message caps, bot limits)
Pricing page + upgrade flow UI
Billing portal redirect (Stripe Customer Portal)
Email alerts: usage at 80% / 100% cap
W6
Launch
Polish + Onboarding + Analytics + Launch
Onboarding flow (workspace → provider → bot → widget)
Dashboard analytics (charts, cost, conversation logs)
Conversation log viewer with transcript export
Team invites (email-based, role assignment)
Error handling, empty states, loading states everywhere
PostHog analytics + Sentry error tracking
Product Hunt / launch prep, pricing live

Database Schema

PostgreSQL via Supabase. Multi-tenant using workspace_id on every table + Row Level Security (RLS). All timestamps in UTC.

Core Tables

workspaces: Top-level tenant
  Column              Type         Key  Description
  id                  uuid         PK   Primary key, default gen_random_uuid()
  name                text              Workspace display name
  slug                text              URL-safe identifier, unique
  plan                text              'starter' | 'pro' | 'enterprise'
  stripe_customer_id  text              Stripe customer reference
  message_quota       integer           Monthly message limit (5000/50000/∞)
  created_at          timestamptz       default now()
workspace_members: User ↔ Workspace join
  Column        Type         Key  Description
  user_id       uuid         FK   → auth.users.id
  workspace_id  uuid         FK   → workspaces.id
  role          text              'admin' | 'editor' | 'viewer'
  invited_by    uuid         FK   → auth.users.id (nullable)
  joined_at     timestamptz       null = pending invite
chatbots: Bot configuration
  Column         Type         Key  Description
  id             uuid         PK   Primary key
  workspace_id   uuid         FK   → workspaces.id (RLS partition)
  name           text              Display name e.g. "Support Bot"
  system_prompt  text              LLM system instruction
  model          text              'gpt-4o' | 'claude-3-5-sonnet' | ...
  provider_id    uuid         FK   → ai_providers.id
  temperature    float4            default 0.7
  max_tokens     integer           default 1024
  widget_config  jsonb             {color, greeting, position, avatar}
  status         text              'draft' | 'live' | 'paused'
  created_at     timestamptz       default now()
ai_providers: Encrypted API keys per workspace
  Column          Type     Key  Description
  id              uuid     PK
  workspace_id    uuid     FK   → workspaces.id
  provider        text          'openai' | 'anthropic' | 'google' | 'groq'
  api_key_enc     text          AES-256 encrypted, never returned to client
  monthly_budget  numeric       USD cap, null = unlimited
  rate_limit_rpm  integer       Requests per minute cap
  is_active       boolean       default true
knowledge_documents: Indexed source files
  Column        Type         Key  Description
  id            uuid         PK
  workspace_id  uuid         FK   → workspaces.id
  chatbot_id    uuid         FK   → chatbots.id (nullable = global)
  source_type   text              'pdf' | 'url' | 'docx' | 'faq' | 'txt'
  source_url    text              Storage URL or scraped URL
  filename      text              Original filename
  chunk_count   integer           Vectors stored in Qdrant
  status        text              'pending' | 'processing' | 'indexed' | 'error'
  error_msg     text              nullable, set on error
  created_at    timestamptz
conversations: Chat session headers
  Column           Type         Key  Description
  id               uuid         PK
  chatbot_id       uuid         FK   → chatbots.id
  workspace_id     uuid         FK   → workspaces.id (denormalized for RLS)
  session_id       text              Client-generated UUID (widget visitor)
  user_identifier  text              Email or anonymous ID from widget
  model_used       text              e.g. 'gpt-4o'
  total_tokens     integer           Accumulated across all messages
  total_cost_usd   numeric           Running cost in USD
  status           text              'active' | 'resolved' | 'handoff'
  started_at       timestamptz
  ended_at         timestamptz       nullable
messages: Individual chat turns
  Column           Type         Key  Description
  id               uuid         PK
  conversation_id  uuid         FK   → conversations.id
  role             text              'user' | 'assistant' | 'system'
  content          text              Message text
  tokens           integer           Token count for this message
  sources          jsonb             RAG chunks used [{doc_id, score, excerpt}]
  latency_ms       integer           Time to first token (ms)
  created_at       timestamptz
usage_events: Append-only metering log
  Column        Type           Key  Description
  id            bigserial      PK
  workspace_id  uuid           FK
  event_type    text                'chat_message' | 'doc_indexed' | 'widget_load'
  tokens_used   integer             nullable
  cost_usd      numeric(10,6)       nullable
  provider      text                nullable
  model         text                nullable
  created_at    timestamptz         default now()
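
To illustrate how workspaces.message_quota might feed the plan-enforcement middleware and the 80% / 100% usage alerts from the sprint plan, here is a minimal sketch. The function name quota_status, the three status strings, and the treatment of a NULL quota as the unlimited (∞) tier are assumptions, not part of the schema above.

```python
from typing import Optional

WARN_PERCENT = 80  # assumption: warn at 80% of the cap, matching the W5 email alerts


def quota_status(messages_used: int, message_quota: Optional[int]) -> str:
    """Return 'ok', 'warn', or 'blocked' for a workspace's monthly usage."""
    if message_quota is None:  # assumption: NULL quota means unlimited
        return "ok"
    if messages_used >= message_quota:
        return "blocked"
    # Integer arithmetic avoids float rounding right at the 80% boundary.
    if messages_used * 100 >= WARN_PERCENT * message_quota:
        return "warn"
    return "ok"
```

With a 5000-message quota, 3999 messages is still "ok", 4000 triggers "warn", and 5000 is "blocked".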

Folder Structure

Monorepo with apps/web (Next.js) and apps/api (FastAPI). Shared types via packages/types.

bash
# Root monorepo
modelpilot/
├── apps/
│   ├── web/                        # Next.js 14 frontend
│   │   ├── app/
│   │   │   ├── (auth)/             # Login, signup, onboarding
│   │   │   ├── (dashboard)/        # Authenticated app shell
│   │   │   │   ├── layout.tsx      # Sidebar + topbar wrapper
│   │   │   │   ├── page.tsx        # Dashboard
│   │   │   │   ├── chatbots/
│   │   │   │   ├── knowledge/
│   │   │   │   ├── providers/
│   │   │   │   ├── logs/
│   │   │   │   ├── widget/
│   │   │   │   ├── team/
│   │   │   │   └── pricing/
│   │   │   └── api/                # Next.js API routes (thin proxies)
│   │   ├── components/
│   │   │   ├── ui/                 # Button, Input, Badge, Modal...
│   │   │   ├── chatbot/            # BotCard, BotEditor, ChatPreview
│   │   │   ├── knowledge/          # UploadZone, DocTable, FAQEditor
│   │   │   └── widget/             # WidgetPreview, EmbedCode
│   │   ├── lib/
│   │   │   ├── api.ts              # Typed fetch wrapper
│   │   │   ├── supabase.ts         # Supabase client
│   │   │   └── hooks/              # useWorkspace, useChatbots, ...
│   │   └── middleware.ts           # Auth guard + workspace redirect
│   │
│   └── api/                        # FastAPI backend
│       ├── main.py                 # App entry, middleware, routers
│       ├── routers/
│       │   ├── chat.py             # POST /chat (SSE stream)
│       │   ├── chatbots.py         # CRUD /chatbots
│       │   ├── knowledge.py        # Upload, list, delete
│       │   ├── providers.py        # API key management
│       │   ├── widget.py           # Public widget endpoint
│       │   ├── analytics.py        # Usage, cost, logs
│       │   └── billing.py          # Stripe webhooks
│       ├── services/
│       │   ├── llm.py              # LiteLLM wrapper
│       │   ├── rag.py              # Qdrant search + context build
│       │   ├── ingestion.py        # Chunking + embedding pipeline
│       │   ├── billing.py          # Stripe + usage metering
│       │   └── encryption.py       # AES-256 for API keys
│       ├── workers/
│       │   └── tasks.py            # Celery tasks (doc processing)
│       ├── models/
│       │   └── schemas.py          # Pydantic request/response models
│       ├── middleware/
│       │   ├── auth.py             # JWT verify + tenant inject
│       │   └── rate_limit.py       # Redis sliding window
│       └── db/
│           ├── client.py           # Supabase + asyncpg connection
│           └── queries.py          # Raw SQL helpers
│
├── packages/
│   └── types/                      # Shared TS types
│
├── widget/                         # Embeddable widget (vanilla JS)
│   ├── src/widget.ts
│   ├── dist/widget.js              # Built, hosted on CDN
│   └── rollup.config.js
│
├── docker-compose.yml              # Redis + Qdrant local dev
└── .env.example

API Routes

REST API on FastAPI. Base URL: https://api.modelpilot.ai/v1. All routes require Authorization: Bearer <jwt> except widget endpoints.

Chatbots

GET     /chatbots        List all chatbots for workspace
POST    /chatbots        Create a new chatbot
GET     /chatbots/:id    Get chatbot by ID
PUT     /chatbots/:id    Update chatbot config
DELETE  /chatbots/:id    Delete chatbot + all data

Chat

POST  /chat                   Send message, returns SSE stream (authenticated)
POST  /widget/:botId/chat     Public widget chat: no auth, rate limited by IP
GET   /widget/:botId/config   Public widget config (colors, greeting, etc.)
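
The sliding-window rate limit on the public widget route can be sketched without Redis to show the logic. This is an in-memory stand-in, not the production middleware: the class name SlidingWindowLimiter and its parameters are hypothetical, and a Redis version would typically keep the same timestamps in a per-key sorted set.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """In-memory stand-in for a Redis sliding-window limiter (illustrative only)."""

    def __init__(self, limit: int, window_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_s = window_s
        self.clock = clock  # injectable so the window can be tested deterministically
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        """Record a request for `key` (an IP or botId) if it fits in the window."""
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have slid out of the window.
        while hits and hits[0] <= now - self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A request handler would call allow(client_ip) and respond with HTTP 429 when it returns False.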

Knowledge

GET     /knowledge             List documents (filter by chatbot)
POST    /knowledge/upload      Upload file (multipart), enqueues processing
POST    /knowledge/scrape      Scrape URL, enqueues processing
DELETE  /knowledge/:id         Delete doc + Qdrant vectors
GET     /knowledge/:id/status  Poll processing status

Analytics

GET  /analytics/overview          KPIs: messages, cost, users (range param)
GET  /analytics/usage             Time series: tokens + cost per day
GET  /conversations               List conversations (filter, paginate)
GET  /conversations/:id/messages  Full message transcript

Chat Endpoint

SSE streaming endpoint. RAG context injected before LLM call. Token usage tracked in real-time and written async to DB.

FastAPI: apps/api/routers/chat.py

python
from fastapi import APIRouter, Depends, HTTPException
from fastapi.responses import StreamingResponse
from pydantic import BaseModel
from typing import AsyncIterator
import json, time

from ..middleware.auth import get_current_workspace
from ..services.llm import stream_chat
from ..services.rag import retrieve_context
from ..db.client import db

router = APIRouter(prefix="/chat", tags=["chat"])

class ChatRequest(BaseModel):
    chatbot_id: str
    session_id: str
    message: str
    history: list[dict] = []   # [{role, content}, ...]

@router.post("")
async def chat(
    req: ChatRequest,
    workspace = Depends(get_current_workspace)
):
    # 1. Load chatbot config (scoped to the caller's workspace).
    #    WorkspaceCtx exposes the tenant as `workspace_id`, not `id`.
    bot = await db.fetchrow(
        "SELECT * FROM chatbots WHERE id=$1 AND workspace_id=$2",
        req.chatbot_id, workspace.workspace_id
    )
    if not bot:
        raise HTTPException(404, "Chatbot not found")

    # 2. RAG: retrieve relevant context
    context_chunks = await retrieve_context(
        query=req.message,
        chatbot_id=req.chatbot_id,
        workspace_id=workspace.workspace_id,
        top_k=5
    )

    # 3. Build messages array
    system = bot["system_prompt"] or ""
    if context_chunks:
        ctx_text = "\n\n".join(c["text"] for c in context_chunks)
        system += f"\n\n--- KNOWLEDGE BASE ---\n{ctx_text}"

    messages = [
        {"role": "system", "content": system},
        *req.history,
        {"role": "user", "content": req.message}
    ]

    # 4. Stream response
    started = time.monotonic()

    async def event_stream() -> AsyncIterator[str]:
        full_text = ""
        total_tokens = 0
        cost_usd = 0.0
        first_token_ms = None  # messages.latency_ms stores time to first token

        async for chunk in stream_chat(
            model=bot["model"],
            messages=messages,
            temperature=bot["temperature"],
            max_tokens=bot["max_tokens"],
            workspace_id=workspace.workspace_id
        ):
            if chunk.type == "text":
                if first_token_ms is None:
                    first_token_ms = int((time.monotonic() - started) * 1000)
                full_text += chunk.text
                yield f"data: {json.dumps({'text': chunk.text})}\n\n"
            elif chunk.type == "usage":
                total_tokens = chunk.total_tokens
                cost_usd = chunk.cost_usd

        # 5. Persist once the text has been streamed; only the final
        #    'done' event waits on these writes.
        await db.execute("""
            INSERT INTO messages (conversation_id, role, content, tokens, sources, latency_ms)
            VALUES ((SELECT id FROM conversations WHERE session_id=$1 LIMIT 1),
                    'assistant', $2, $3, $4, $5)
        """, req.session_id, full_text, total_tokens,
            json.dumps(context_chunks), first_token_ms)

        await db.execute("""
            INSERT INTO usage_events (workspace_id, event_type, tokens_used, cost_usd, model)
            VALUES ($1, 'chat_message', $2, $3, $4)
        """, workspace.workspace_id, total_tokens, cost_usd, bot["model"])

        yield f"data: {json.dumps({'done': True, 'tokens': total_tokens, 'cost': cost_usd})}\n\n"

    return StreamingResponse(
        event_stream(),
        media_type="text/event-stream",
        headers={"Cache-Control": "no-cache", "X-Accel-Buffering": "no"}
    )
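
The SSE framing used by the endpoint above (a `data: <json>` line followed by a blank line) can be exercised in isolation. The sketch below pairs an encoder with an incremental decoder that tolerates frames split across network reads, which is the case a consumer has to handle. The names sse_event and SSEDecoder are hypothetical helpers, not part of the codebase.

```python
import json
from typing import List


def sse_event(payload: dict) -> str:
    """Encode one payload as a single SSE frame."""
    return f"data: {json.dumps(payload)}\n\n"


class SSEDecoder:
    """Incrementally decode `data:` frames from arbitrarily split text chunks."""

    def __init__(self) -> None:
        self._buf = ""

    def feed(self, data: str) -> List[dict]:
        """Add received text and return every payload completed so far."""
        self._buf += data
        events: List[dict] = []
        # A frame is complete only once its blank-line terminator has arrived.
        while "\n\n" in self._buf:
            frame, self._buf = self._buf.split("\n\n", 1)
            for line in frame.split("\n"):
                if line.startswith("data: "):
                    events.append(json.loads(line[len("data: "):]))
        return events
```

Because json.dumps escapes newlines inside strings, a payload can never contain the raw blank line that terminates a frame.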

LiteLLM Service: services/llm.py

python
from dataclasses import dataclass

import litellm
from .encryption import decrypt_key
from ..db.client import db

# Cost per 1M tokens (input, output), in USD
MODEL_COSTS = {
    "gpt-4o":            (5.00, 15.00),
    "gpt-4o-mini":       (0.15,  0.60),
    "claude-3-5-sonnet": (3.00, 15.00),
    "claude-3-haiku":    (0.25,  1.25),
    "gemini-1.5-pro":    (3.50, 10.50),
    "gemini-flash":      (0.075, 0.30),
}

# Model-name prefix -> value stored in ai_providers.provider
PROVIDER_MAP = {"gpt": "openai", "claude": "anthropic", "gemini": "google"}

@dataclass
class TextChunk:
    text: str
    type: str = "text"

@dataclass
class UsageChunk:
    total_tokens: int
    cost_usd: float
    type: str = "usage"

async def stream_chat(model, messages, temperature, max_tokens, workspace_id):
    # Resolve the provider from the model prefix, e.g. 'gpt-4o' -> 'openai'
    prefix = model.split("-")[0]
    provider_name = PROVIDER_MAP.get(prefix, prefix)

    # Fetch the workspace's active, encrypted key for that provider
    row = await db.fetchrow(
        "SELECT api_key_enc FROM ai_providers "
        "WHERE workspace_id=$1 AND provider=$2 AND is_active",
        workspace_id, provider_name
    )
    if row is None:
        raise LookupError(f"No active {provider_name} key for this workspace")
    api_key = decrypt_key(row["api_key_enc"])

    response = await litellm.acompletion(
        model=model,
        messages=messages,
        temperature=temperature,
        max_tokens=max_tokens,
        api_key=api_key,
        stream=True,
        stream_options={"include_usage": True},  # request a final usage chunk
    )

    total_in = total_out = 0
    async for chunk in response:
        # The usage-only chunk can arrive with an empty choices list
        if chunk.choices and chunk.choices[0].delta.content:
            yield TextChunk(text=chunk.choices[0].delta.content)
        usage = getattr(chunk, "usage", None)
        if usage:
            total_in = usage.prompt_tokens
            total_out = usage.completion_tokens

    # Unknown models fall back to the gpt-4o rates
    ci, co = MODEL_COSTS.get(model, (5, 15))
    cost = (total_in * ci + total_out * co) / 1_000_000
    yield UsageChunk(total_tokens=total_in + total_out, cost_usd=cost)
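
The cost arithmetic above can be checked by hand. The sketch below repeats the per-million-token formula with two of the rates from MODEL_COSTS and adds a budget check against ai_providers.monthly_budget. The function names estimate_cost_usd and within_budget are assumptions for illustration, and the document does not specify where a budget check would run.

```python
from typing import Optional

# USD per 1M tokens (input, output); values copied from MODEL_COSTS above
MODEL_COSTS = {
    "gpt-4o":      (5.00, 15.00),
    "gpt-4o-mini": (0.15, 0.60),
}
FALLBACK_COST = (5.00, 15.00)  # same fallback the service uses for unknown models


def estimate_cost_usd(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Price one completion from its token counts."""
    cost_in, cost_out = MODEL_COSTS.get(model, FALLBACK_COST)
    return (prompt_tokens * cost_in + completion_tokens * cost_out) / 1_000_000


def within_budget(spent_usd: float, next_cost_usd: float,
                  monthly_budget: Optional[float]) -> bool:
    """True if the next call fits under the cap; a None budget means unlimited."""
    if monthly_budget is None:
        return True
    return spent_usd + next_cost_usd <= monthly_budget
```

For example, 1,000 prompt tokens and 500 completion tokens on gpt-4o-mini cost (1000 × 0.15 + 500 × 0.60) / 1,000,000 = $0.00045.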

Next.js: Consuming the stream (TypeScript)

typescript
// components/chatbot/ChatPreview.tsx
export async function sendMessage(
  chatbotId: string,
  message: string,
  history: Message[],
  onChunk: (text: string) => void,
  onDone: (usage: Usage) => void
) {
  const res = await fetch(`${API_URL}/chat`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${await getAccessToken()}`,
    },
    body: JSON.stringify({ chatbot_id: chatbotId, session_id: getSessionId(), message, history }),
  });

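  if (!res.ok || !res.body) {
    throw new Error(`Chat request failed: ${res.status}`);
  }

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // A network chunk can end mid-line, so keep the unfinished tail buffered
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? "";
    for (const line of lines) {
      if (!line.startsWith("data: ")) continue;
      const payload = JSON.parse(line.slice(6));
      if (payload.text) onChunk(payload.text);
      if (payload.done) onDone({ tokens: payload.tokens, cost: payload.cost });
    }
  }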
}

RAG Pipeline

Document ingestion → chunking → embedding → Qdrant storage. Retrieval at chat time injects top-K relevant chunks into the system prompt.
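
The chunking step is a sliding window over words: each chunk starts chunk_size - overlap words after the previous one, so neighbouring chunks share overlap words. A minimal sketch of that window follows. The helper name chunk_words and the explicit overlap guard are assumptions added for illustration.

```python
from typing import List


def chunk_words(words: List[str], chunk_size: int, overlap: int) -> List[List[str]]:
    """Split `words` into windows of `chunk_size` that share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        # With overlap >= chunk_size the window would never advance.
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    i = 0
    while i < len(words):
        chunks.append(words[i : i + chunk_size])
        i += step
    return chunks
```

With 10 words, chunk_size=4 and overlap=1, the windows start at words 0, 3, 6 and 9, so the last chunk holds a single word. Short trailing chunks like that are the ones a minimum-length filter would drop.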

Ingestion Worker: workers/tasks.py

python
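import os
import re
import uuid

import pdfplumber
from celery import Celery
from openai import AsyncOpenAI
from qdrant_client import QdrantClient
from qdrant_client.models import PointStruct, Distance, VectorParams

# Connection settings come from the environment (see .env.example)
app = Celery("tasks", broker=os.environ.get("REDIS_URL", "redis://localhost:6379"))
qdrant = QdrantClient(url=os.environ["QDRANT_URL"], api_key=os.environ.get("QDRANT_API_KEY"))
openai = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment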

def chunk_text(text: str, chunk_size=256, overlap=32) -> list[str]:
    """Split text into overlapping token-approximate chunks."""
    words = text.split()
    chunks, i = [], 0
    while i < len(words):
        chunk = " ".join(words[i : i + chunk_size])
        chunks.append(chunk)
        i += chunk_size - overlap
    return [c for c in chunks if len(c.strip()) > 20]

async def embed_texts(texts: list[str]) -> list[list[float]]:
    res = await openai.embeddings.create(
        model="text-embedding-3-small", input=texts
    )
    return [e.embedding for e in res.data]

def ensure_collection(workspace_id: str):
    col = f"ws_{workspace_id.replace('-','_')}"
    if col not in [c.name for c in qdrant.get_collections().collections]:
        qdrant.create_collection(col, vectors_config=VectorParams(
            size=1536, distance=Distance.COSINE
        ))
    return col

@app.task(bind=True, max_retries=3)
def ingest_document(self, doc_id: str, workspace_id: str, chatbot_id: str,
                    file_path: str, source_type: str):
    import asyncio
    asyncio.run(_ingest(doc_id, workspace_id, chatbot_id, file_path, source_type))

async def _ingest(doc_id, workspace_id, chatbot_id, file_path, source_type):
    from ..db.client import db

    await db.execute(
        "UPDATE knowledge_documents SET status='processing' WHERE id=$1", doc_id
    )
    try:
        # 1. Extract text
        if source_type == "pdf":
            with pdfplumber.open(file_path) as pdf:
                text = "\n".join(p.extract_text() or "" for p in pdf.pages)
        elif source_type == "url":
            import trafilatura
            downloaded = trafilatura.fetch_url(file_path)
            text = trafilatura.extract(downloaded) or ""
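        else:
            # Plain-text sources only; 'docx' needs python-docx extraction,
            # which is not handled in this branch.
            with open(file_path, encoding="utf-8") as f:
                text = f.read()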

        # 2. Chunk
        chunks = chunk_text(re.sub(r'\s+', ' ', text))

        # 3. Embed (batch of 100)
        all_embeddings = []
        for i in range(0, len(chunks), 100):
            batch = await embed_texts(chunks[i:i+100])
            all_embeddings.extend(batch)

        # 4. Upsert to Qdrant
        col = ensure_collection(workspace_id)
        points = [
            PointStruct(
                id=str(uuid.uuid4()),
                vector=emb,
                payload={
                    "text": chunk,
                    "doc_id": doc_id,
                    "chatbot_id": chatbot_id,
                    "workspace_id": workspace_id,
                    "chunk_index": i,
                }
            )
            for i, (chunk, emb) in enumerate(zip(chunks, all_embeddings))
        ]
        qdrant.upsert(collection_name=col, points=points)

        await db.execute(
            "UPDATE knowledge_documents SET status='indexed', chunk_count=$1 WHERE id=$2",
            len(chunks), doc_id
        )
    except Exception as e:
        await db.execute(
            "UPDATE knowledge_documents SET status='error', error_msg=$1 WHERE id=$2",
            str(e), doc_id
        )
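
The worker above assigns each point a random uuid4, so a retried task would upsert a second copy of every chunk. One way to get the task idempotency called for in the security checklist is to derive the point ID from the document ID and chunk index. This is a sketch of that idea, not the author's implementation; point_id is a hypothetical helper and it assumes doc_id is a UUID string, as in the schema.

```python
import uuid


def point_id(doc_id: str, chunk_index: int) -> str:
    """Stable Qdrant point ID for one chunk of one document.

    uuid5 is a deterministic hash of (namespace, name), so re-running
    ingestion for the same document overwrites points instead of adding new ones.
    """
    return str(uuid.uuid5(uuid.UUID(doc_id), str(chunk_index)))
```

Swapping this in for `str(uuid.uuid4())` keeps the ID a valid UUID string while making repeated upserts overwrite rather than duplicate.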

Retrieval: services/rag.py

python
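import os

from qdrant_client import QdrantClient
from qdrant_client.models import Filter, FieldCondition, MatchValue
from openai import AsyncOpenAI

# Connection settings come from the environment (see .env.example)
qdrant = QdrantClient(url=os.environ["QDRANT_URL"], api_key=os.environ.get("QDRANT_API_KEY"))
openai = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment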

async def retrieve_context(
    query: str,
    chatbot_id: str,
    workspace_id: str,
    top_k: int = 5,
    score_threshold: float = 0.72
) -> list[dict]:
    # Embed the query
    res = await openai.embeddings.create(
        model="text-embedding-3-small", input=query
    )
    q_vector = res.data[0].embedding

    col = f"ws_{workspace_id.replace('-','_')}"

    # Search with chatbot_id filter
    results = qdrant.search(
        collection_name=col,
        query_vector=q_vector,
        limit=top_k,
        score_threshold=score_threshold,
        query_filter=Filter(must=[
            FieldCondition(key="chatbot_id", match=MatchValue(value=chatbot_id))
        ])
    )

    return [
        {"text": r.payload["text"], "score": round(r.score, 3), "doc_id": r.payload["doc_id"]}
        for r in results
    ]
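
What the vector search does with limit and score_threshold can be shown in plain Python: score every chunk by cosine similarity, drop anything under the threshold, and keep the best top_k. This is a conceptual sketch only. Qdrant uses an approximate index rather than a full scan, and the function names here are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: Sequence[float],
          chunks: Sequence[Tuple[str, Sequence[float]]],
          k: int = 5,
          score_threshold: float = 0.72) -> List[Tuple[str, float]]:
    """Return up to k (text, score) pairs scoring at least the threshold, best first."""
    scored = [(text, cosine(query, vec)) for text, vec in chunks]
    kept = [pair for pair in scored if pair[1] >= score_threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)[:k]
```

A threshold that is too strict returns an empty list, in which case the chat endpoint sends the system prompt without a knowledge-base section.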

Embed Widget

Vanilla JS, zero dependencies, ~6KB gzipped. Injected via a single <script> tag. Self-contained shadow DOM to prevent CSS leakage.

javascript
// widget/src/widget.ts, compiled to widget/dist/widget.js
(function() {
  const config = window.ModelPilotConfig || {};
  const BOT_ID = config.botId || document.currentScript.dataset.botId;
  const API  = "https://api.modelpilot.ai/v1";
  let sessionId = localStorage.getItem("mp_session");
  if (!sessionId) {
    sessionId = crypto.randomUUID();
    localStorage.setItem("mp_session", sessionId);
  }

  // Fetch widget config from API
  async function init() {
    const res = await fetch(`${API}/widget/${BOT_ID}/config`);
    const cfg = await res.json();
    render(cfg);
  }

  function render(cfg) {
    const host = document.createElement("div");
    const shadow = host.attachShadow({ mode: "closed" });
    document.body.appendChild(host);

    shadow.innerHTML = `
      <style>
        :host { all: initial; font-family: system-ui; }
        #launcher {
          position: fixed; ${cfg.position === "bottom-left" ? "left" : "right"}: 20px;
          bottom: 20px; width: 52px; height: 52px; border-radius: 50%;
          background: ${cfg.accentColor}; cursor: pointer; display: flex;
          align-items: center; justify-content: center; font-size: 24px;
          box-shadow: 0 4px 20px rgba(0,0,0,0.18); z-index: 999999;
          border: none; transition: transform .2s;
        }
        #launcher:hover { transform: scale(1.08); }
        #window {
          position: fixed; ${cfg.position === "bottom-left" ? "left" : "right"}: 20px;
          bottom: 82px; width: 360px; height: 560px; border-radius: 18px;
          background: #fff; box-shadow: 0 12px 48px rgba(0,0,0,0.18);
          display: none; flex-direction: column; overflow: hidden; z-index: 999998;
        }
        #window.open { display: flex; }
        #header { background: ${cfg.accentColor}; padding: 14px 16px;
          color: white; font-weight: 700; font-size: 14px;
          display: flex; align-items: center; gap: 10px; }
        #messages { flex: 1; overflow-y: auto; padding: 14px;
          display: flex; flex-direction: column; gap: 10px; background: #f7f8fc; }
        .msg { max-width: 82%; padding: 10px 14px; border-radius: 12px;
          font-size: 13.5px; line-height: 1.5; }
        .user { align-self: flex-end; background: ${cfg.accentColor}; color: white;
          border-radius: 12px 12px 2px 12px; }
        .bot { align-self: flex-start; background: white; color: #09090b;
          border-radius: 12px 12px 12px 2px; box-shadow: 0 1px 4px rgba(0,0,0,.08); }
        #input-row { display: flex; gap: 8px; padding: 10px; border-top: 1px solid #f0f0f0; }
        #input { flex: 1; border: 1px solid #e4e4e7; border-radius: 9px;
          padding: 8px 12px; font-size: 13px; outline: none; }
        #send { background: ${cfg.accentColor}; color: white; border: none;
          border-radius: 9px; padding: 8px 14px; cursor: pointer; font-weight: 700; }
      </style>
      <button id="launcher">💬</button>
      <div id="window">
        <div id="header">🤖 <span id="title"></span></div>
        <div id="messages">
          <div class="msg bot" id="greeting"></div>
        </div>
        <div id="input-row">
          <input id="input" placeholder="Type a message…" />
          <button id="send">↑</button>
        </div>
      </div>`;

    const launcher = shadow.getElementById("launcher");
    const win      = shadow.getElementById("window");
    const msgs     = shadow.getElementById("messages");
    const input    = shadow.getElementById("input");
    const send     = shadow.getElementById("send");
    let history = [];

    // Set API-provided strings as text, never as HTML, so they cannot inject markup
    shadow.getElementById("title").textContent = cfg.botName || "Assistant";
    shadow.getElementById("greeting").textContent = cfg.greeting || "Hi! How can I help?";

    launcher.onclick = () => win.classList.toggle("open");

    async function sendMessage() {
      const text = input.value.trim();
      if (!text) return;
      input.value = "";

      addMsg("user", text);

      const botEl = addMsg("bot", "▌");  // streaming cursor
      let full = "";

      // Send prior turns only. This assumes the widget endpoint appends
      // `message` itself, as POST /chat does; pushing the user turn first
      // would send it twice.
      const res = await fetch(`${API}/widget/${BOT_ID}/chat`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ session_id: sessionId, message: text, history })
      });
      history.push({ role: "user", content: text });

      const reader = res.body.getReader();
      const dec = new TextDecoder();
      let buf = "";

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        // Keep any partial line in buf until the rest of it arrives
        buf += dec.decode(value, { stream: true });
        const lines = buf.split("\n");
        buf = lines.pop();
        lines
          .filter(l => l.startsWith("data: "))
          .forEach(l => {
            const p = JSON.parse(l.slice(6));
            if (p.text) { full += p.text; botEl.textContent = full + "▌"; }
            if (p.done) { botEl.textContent = full; }
          });
        msgs.scrollTop = msgs.scrollHeight;
      }
      history.push({ role: "assistant", content: full });
    }

    function addMsg(role, text) {
      const el = document.createElement("div");
      el.className = "msg " + role;
      el.textContent = text;
      msgs.appendChild(el);
      msgs.scrollTop = msgs.scrollHeight;
      return el;
    }

    send.onclick = sendMessage;
    input.onkeydown = e => e.key === "Enter" && sendMessage();
  }

  init();
})();

Auth & Multi-tenancy

Supabase handles JWT issuance and OAuth. FastAPI verifies JWTs and injects workspace context. Row Level Security on every table enforces isolation at the DB layer.

FastAPI JWT middleware: middleware/auth.py

python
from dataclasses import dataclass
from typing import Optional

from fastapi import Header, HTTPException, Depends
from jose import jwt, JWTError
import os
from ..db.client import db

SUPABASE_JWT_SECRET = os.environ["SUPABASE_JWT_SECRET"]

@dataclass
class WorkspaceCtx:
    user_id: str
    workspace_id: str
    role: str
    plan: str
    message_quota: Optional[int]

async def get_current_workspace(
    authorization: str = Header(...)
) -> WorkspaceCtx:
    try:
        token = authorization.removeprefix("Bearer ")
        # Supabase access tokens carry aud="authenticated"; python-jose rejects
        # a token with an aud claim unless the expected audience is passed.
        payload = jwt.decode(
            token, SUPABASE_JWT_SECRET,
            algorithms=["HS256"], audience="authenticated"
        )
        user_id = payload["sub"]
    except (JWTError, KeyError):
        raise HTTPException(401, "Invalid token")

    # Load the user's earliest joined workspace.
    # (Not cached here; this lookup is a candidate for a ~60 s Redis cache.)
    row = await db.fetchrow("""
        SELECT wm.workspace_id, wm.role, w.plan, w.message_quota
        FROM workspace_members wm
        JOIN workspaces w ON w.id = wm.workspace_id
        WHERE wm.user_id = $1 AND wm.joined_at IS NOT NULL
        ORDER BY wm.joined_at LIMIT 1
    """, user_id)

    if not row:
        raise HTTPException(403, "No workspace found")

    return WorkspaceCtx(
        user_id=user_id,
        workspace_id=str(row["workspace_id"]),
        role=row["role"],
        plan=row["plan"],
        message_quota=row["message_quota"],
    )
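
For readers who want to see what an HS256 check involves, the sketch below signs and verifies a token using only the standard library. It is illustrative and not a replacement for a JWT library in production. The helper names are hypothetical, and treating the secret as a UTF-8 string is an assumption.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def _b64url_decode(part: str) -> bytes:
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def _signature(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()


def sign_hs256(claims: dict, secret: str) -> str:
    """Build a compact HS256 JWT (used here only to produce test tokens)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url_encode(json.dumps(claims).encode())
    return f"{header}.{body}.{_b64url_encode(_signature(f'{header}.{body}', secret))}"


def verify_hs256(token: str, secret: str, audience: str,
                 now: Optional[float] = None) -> dict:
    """Return the claims if signature, audience and expiry all check out."""
    try:
        header_b64, body_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(body_b64))
        sig = _b64url_decode(sig_b64)
    except Exception:
        raise ValueError("malformed token") from None
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise ValueError("malformed token")
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(sig, _signature(f"{header_b64}.{body_b64}", secret)):
        raise ValueError("bad signature")
    if claims.get("aud") != audience:
        raise ValueError("wrong audience")
    if claims.get("exp", 0) <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims
```

The signature is checked before any claim is trusted, and the algorithm is pinned so a token cannot choose its own verification method.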

Supabase RLS Policies: SQL

sql
-- Enable RLS on tenant-scoped tables
-- (ai_providers is included so that its policy below takes effect)
ALTER TABLE chatbots            ENABLE ROW LEVEL SECURITY;
ALTER TABLE knowledge_documents ENABLE ROW LEVEL SECURITY;
ALTER TABLE conversations       ENABLE ROW LEVEL SECURITY;
ALTER TABLE messages            ENABLE ROW LEVEL SECURITY;
ALTER TABLE usage_events        ENABLE ROW LEVEL SECURITY;
ALTER TABLE ai_providers        ENABLE ROW LEVEL SECURITY;

-- Helper function: get user's workspace IDs
CREATE OR REPLACE FUNCTION auth.workspace_ids()
RETURNS uuid[] LANGUAGE sql STABLE AS $$
  SELECT array_agg(workspace_id)
  FROM workspace_members
  WHERE user_id = auth.uid() AND joined_at IS NOT NULL;
$$;

-- Chatbots: members can read, editors/admins can write
CREATE POLICY chatbots_select ON chatbots
  FOR SELECT USING (workspace_id = ANY(auth.workspace_ids()));

CREATE POLICY chatbots_insert ON chatbots
  FOR INSERT WITH CHECK (
    workspace_id IN (
      SELECT workspace_id FROM workspace_members
      WHERE user_id = auth.uid() AND role IN ('editor','admin')
    )
  );

CREATE POLICY chatbots_update ON chatbots
  FOR UPDATE USING (
    workspace_id IN (
      SELECT workspace_id FROM workspace_members
      WHERE user_id = auth.uid() AND role IN ('editor','admin')
    )
  );

-- Conversations: members can view only their workspace
CREATE POLICY conversations_select ON conversations
  FOR SELECT USING (workspace_id = ANY(auth.workspace_ids()));

-- Admins only: billing + provider keys
CREATE POLICY providers_admin ON ai_providers
  FOR ALL USING (
    workspace_id IN (
      SELECT workspace_id FROM workspace_members
      WHERE user_id = auth.uid() AND role = 'admin'
    )
  );
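
The same role rules can be mirrored at the API layer so that a request fails fast with a clear error before it reaches the database. The sketch below encodes the policies above as a lookup table. The action names and the is_allowed helper are hypothetical, and RLS remains the enforcement layer of record.

```python
from typing import Dict, FrozenSet

READ_ROLES: FrozenSet[str] = frozenset({"admin", "editor", "viewer"})
WRITE_ROLES: FrozenSet[str] = frozenset({"admin", "editor"})
ADMIN_ROLES: FrozenSet[str] = frozenset({"admin"})

# Hypothetical action names mapped to the roles the RLS policies allow
PERMISSIONS: Dict[str, FrozenSet[str]] = {
    "chatbots:read": READ_ROLES,        # chatbots_select
    "chatbots:write": WRITE_ROLES,      # chatbots_insert / chatbots_update
    "conversations:read": READ_ROLES,   # conversations_select
    "providers:manage": ADMIN_ROLES,    # providers_admin
}


def is_allowed(role: str, action: str) -> bool:
    """True if `role` may perform `action`; unknown actions are denied."""
    return role in PERMISSIONS.get(action, frozenset())
```

Denying unknown actions by default keeps a newly added route closed until someone maps it to a role set.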

Next.js Auth middleware: middleware.ts

typescript
import { createMiddlewareClient } from "@supabase/auth-helpers-nextjs";
import { NextResponse } from "next/server";
import type { NextRequest } from "next/server";

export async function middleware(req: NextRequest) {
  const res  = NextResponse.next();
  const supabase = createMiddlewareClient({ req, res });
  const { data: { session } } = await supabase.auth.getSession();

  const isAuthPage = req.nextUrl.pathname.startsWith("/login");

  if (!session && !isAuthPage) {
    return NextResponse.redirect(new URL("/login", req.url));
  }
  if (session && isAuthPage) {
    return NextResponse.redirect(new URL("/", req.url));
  }
  return res;
}

export const config = {
  matcher: ["/((?!_next/static|_next/image|favicon|widget.js).*)"]
};

Deployment

Frontend on Vercel, backend on Railway. Widget JS hosted on Cloudflare R2 for <50ms global delivery. All services deploy on git push.

Vercel: Next.js frontend · Edge network · Auto-SSL
Railway: FastAPI · Celery · Redis · Auto-deploy
Supabase: Postgres · Auth · Storage · Realtime
Qdrant Cloud: Managed vector DB · 1GB free tier

docker-compose.yml (Local Dev)

yaml
version: "3.9"
services:
  redis:
    image: redis:7-alpine
    ports: ["6379:6379"]

  qdrant:
    image: qdrant/qdrant:latest
    ports: ["6333:6333"]
    volumes: ["./qdrant_storage:/qdrant/storage"]

  api:
    build: ./apps/api
    ports: ["8000:8000"]
    env_file: .env
    depends_on: [redis, qdrant]
    command: uvicorn main:app --host 0.0.0.0 --port 8000 --reload

  worker:
    build: ./apps/api
    env_file: .env
    depends_on: [redis]
    command: celery -A workers.tasks worker --loglevel=info

Security Checklist

✓ Implement
AES-256 encrypt all API keys at rest
Row Level Security on all Supabase tables
Rate limiting per workspace (Redis)
Widget CORS: allow any origin (public)
API CORS: restrict to your domains only
Stripe webhook signature verification
Sentry for error monitoring (free tier)
HTTPS everywhere (Vercel + Railway enforce)
⚠ Don't Forget
Never log API keys or tokens to console
Monthly budget cap to prevent runaway costs
Input sanitization before sending to LLM
Rotate SUPABASE_JWT_SECRET quarterly
Soft-delete chatbots (don't destroy data)
Celery task idempotency (avoid duplicate chunks)
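
For the Stripe webhook signature check listed above, the official stripe library provides stripe.Webhook.construct_event, which is the usual choice. The sketch below shows the underlying scheme with the standard library only: the Stripe-Signature header carries a timestamp t and one or more v1 signatures, each an HMAC-SHA256 of "t.payload" keyed with the endpoint secret. The function name and the injectable now parameter are assumptions for illustration.

```python
import hashlib
import hmac
import time
from typing import List, Optional


def verify_stripe_signature(payload: bytes, sig_header: str, secret: str,
                            tolerance_s: int = 300,
                            now: Optional[float] = None) -> bool:
    """Check a Stripe-Signature header against the raw request body."""
    timestamp: Optional[str] = None
    candidates: List[str] = []
    for item in sig_header.split(","):
        key, _, value = item.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            candidates.append(value)
    if timestamp is None or not timestamp.isdigit() or not candidates:
        return False

    # The signed message is the timestamp, a dot, then the exact raw body.
    signed = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    if not any(hmac.compare_digest(expected.encode(), c.encode()) for c in candidates):
        return False

    # Reject old events to limit replay of a captured request.
    current = time.time() if now is None else now
    return abs(current - int(timestamp)) <= tolerance_s
```

The check must run on the raw request bytes. Re-serializing parsed JSON can change whitespace or key order and invalidate the signature.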

Environment Variables

bash
# .env.example

# Supabase
SUPABASE_URL=https://xxxx.supabase.co
SUPABASE_ANON_KEY=eyJ...
SUPABASE_SERVICE_KEY=eyJ...          # Never expose to client
SUPABASE_JWT_SECRET=your-jwt-secret

# Qdrant
QDRANT_URL=https://xxxx.qdrant.io
QDRANT_API_KEY=...

# OpenAI (for embeddings only)
OPENAI_API_KEY=sk-...

# Encryption key for stored provider keys (32 bytes)
ENCRYPTION_KEY=base64-encoded-32-byte-key

# Redis
REDIS_URL=redis://localhost:6379

# Stripe
STRIPE_SECRET_KEY=sk_live_...
STRIPE_WEBHOOK_SECRET=whsec_...
STRIPE_PRICE_ID_STARTER=price_...
STRIPE_PRICE_ID_PRO=price_...

# App
NEXT_PUBLIC_API_URL=https://api.modelpilot.ai/v1
NEXT_PUBLIC_SUPABASE_URL=https://xxxx.supabase.co
NEXT_PUBLIC_SUPABASE_ANON_KEY=eyJ...